A Rule Based Morphological Analyzer and A Morphological Disambiguator for Kazakh Language

Authors

  • Gulshat Kessikbayeva
  • Ilyas Cicekli
Abstract

Morphological analysis is critical for natural language processing tasks, especially for agglutinative languages. This study gives the implementation details of a rule-based morphological analyzer for Kazakh, an agglutinative language. To build the analyzer, a detailed computational analysis of Kazakh morphology is carried out, including a formalization of its alternation and morphotactic rules. In the implementation, these rules are expressed as two-level morphology rules and compiled with the Foma finite-state compiler. This is the first detailed computational analysis of Kazakh from a morphological point of view. A word can have more than one morphological parse, but only one of them is valid in a given sentence; a morphological disambiguator selects that parse from among the candidates. In this paper, we also present a transformation-based morphological disambiguator for Kazakh, which is a variation of the Brill tagger.
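The abstract does not give the disambiguator's rule templates or tagset, so the following is only a minimal Python sketch of the general transformation-based (Brill-style) idea: start from a baseline choice among the analyzer's candidate parses, then apply an ordered list of context-triggered rewrite rules. The `Rule` class, the `disambiguate` function, the "previous tag" template, and the tags `Det`, `Noun`, and `Verb` are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule template: change from_tag to to_tag
    when the previous word is currently tagged prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def disambiguate(candidates: List[List[str]], rules: List[Rule]) -> List[str]:
    """Pick one parse per word from the analyzer's candidate lists.

    candidates[i] holds the possible parses of word i, with the baseline
    guess first (an assumption of this sketch). Rules are applied in
    order, left to right, and may only select a parse that the analyzer
    actually proposed for that word.
    """
    chosen = [parses[0] for parses in candidates]  # baseline choice
    for rule in rules:
        for i in range(1, len(chosen)):
            if (chosen[i] == rule.from_tag
                    and chosen[i - 1] == rule.prev_tag
                    and rule.to_tag in candidates[i]):
                chosen[i] = rule.to_tag
    return chosen


# Toy input: the second word is ambiguous between two parses.
toy_candidates = [["Det"], ["Verb", "Noun"], ["Verb"]]
toy_rules = [Rule(from_tag="Verb", to_tag="Noun", prev_tag="Det")]
result = disambiguate(toy_candidates, toy_rules)
print(result)
```

In a full Brill-style system the ordered rule list is learned from an annotated corpus by repeatedly keeping the rule that removes the most errors; here the single rule is written by hand to keep the sketch short.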

Similar resources

Rule Based Morphological Analyzer of Kazakh Language

Having a morphological analyzer is critical for NLP tasks, especially for agglutinative languages. This paper presents a detailed computational analysis of Kazakh, which is an agglutinative language. With a detailed analysis of Kazakh morphology, the formalization of rules over all morphotactics of Kazakh is worked out and a rule-based morphological ...

A Rule-Based Morphological Disambiguator for Turkish

Part-of-speech (POS) tagging is the process of assigning each word of an input text to an appropriate morphological class. Automatic recognition of parts of speech is very important for high-level NLP applications, since it is usually infeasible to perform this task manually in practical systems. One approach to POS tagging uses morphological disambiguation which selects the most suitab...

Finite State Approach to the Kazakh Nominal Paradigm

This work presents a finite-state approach to the Kazakh nominal paradigm. A finite-state transducer was developed and implemented for the nominal paradigm of Kazakh, an agglutinative language. The morphophonemic constraints that are imposed by the Kazakh language synharmonism (vowel and consonant harmony) on the combinations of letters under af...

Pars Morph: A Persian Morphological Analyzer

In this paper, the theoretical foundation, implementation, and uses of Pars Morph, a Persian morphological analyzer, are introduced. Pars Morph is a rule-based Persian morphological analysis system, which analyzes the internal structure of words in Persian and determines the grammatical category and function of the word parts. Pars Morph being in link with a lexicon covering about 45...

Formal models of nouns in the Kazakh language

This paper explains how semantic hypergraphs are used to construct ontological models of morphological rules in the Kazakh language. The nodes within these graphs represent semantic features (morphological concepts) and the edges within represent the relationships between these features. Word forms within the hypergraph structure are described in trees which are converted into linear parenthesi...

Publication date: 2015